Promoting Flexible Translations in Statistical Machine Translation
Author: Rico Sennrich
Abstract
While SMT systems can learn to translate multiword expressions (MWEs) from parallel text, they typically have no notion of non-compositionality, and thus overgeneralise translations that are only used in certain contexts. This paper describes a novel approach to measure the flexibility of a phrase pair, i.e. its tendency to occur in many contexts, in contrast to phrase pairs that are only valid in one or a few fixed expressions. The measure learns from the parallel training text, is simple to implement and language independent. We argue that flexible phrase pairs should be preferred over inflexible ones, and present experiments with phrase-based and hierarchical translation models in which we observe performance gains of up to 0.9 BLEU points.

Posted at the Zurich Open Repository and Archive, University of Zurich. ZORA URL: https://doi.org/10.5167/uzh-81049 (Published Version)

Originally published at: Sennrich, Rico (2013). Promoting Flexible Translations in Statistical Machine Translation. In: Proceedings of the XIV Machine Translation Summit, Nice, 2-6 September 2013, 207-214.

Rico Sennrich, Institute of Computational Linguistics, University of Zurich, Binzmühlestr. 14, CH-8050 Zürich
Similar resources
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Efficient Statistical Machine Translation with Constrained Reordering
This paper describes how word alignment information makes machine translation more efficient. Following a statistical approach based on finite-state transducers, we perform reordering of source sentences in training using automatic word alignments and estimate a phrase-based translation model. Using this model, we translate monotonically taking a permutation graph as input. The permutation grap...
'Poetic' Statistical Machine Translation: Rhyme and Meter
As a prerequisite to translation of poetry, we implement the ability to produce translations with meter and rhyme for phrase-based MT, examine whether the hypothesis space of such a system is flexible enough to accommodate such constraints, and investigate the impact of such constraints on translation quality.
Evaluating Statistical Machine Translation from English to Dutch
In this paper, I attempt to evaluate the effectiveness of using statistical machine translation to translate an English text into Dutch, using empirical evaluation and the Bleu evaluation metric. I also give a brief overview of the theory behind statistical machine translation and automated translation evaluation metrics. I have translated a sample of the English proceedings of the European Par...
Multilingual Mobile-Phone Translation Services for World Travelers
This demonstration introduces two new multilingual translation services for mobile phones. The first translation service provides state-of-the-art text-to-text translations of Japanese as well as English conversational spoken language in the travel domain into 17 languages using statistical machine translation technologies trained automatically from a large-scale multilingual corpus. The second...